Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM
Abstract
In this paper, we propose a robust speaker recognition method based on position-dependent Cepstral Mean Normalization (CMN), which compensates for channel distortion that depends on the speaker position. In the training stage, the system measures the transmission characteristics from a set of grid points in the room to the microphone and estimates the compensation parameters for each speaker position a priori. In the recognition stage, the system estimates the speaker position, selects the compensation parameters corresponding to the estimated position, applies CMN to the speech, and performs speaker recognition. In our previous study, we proposed a text-independent speaker recognition method that combines speaker-specific Gaussian mixture models (GMMs) with syllable-based HMMs adapted to the speakers by MAP [Nakagawa, S., Zhang, W., Takahashi, M., 2004. Text-independent speaker recognition by combining speaker-specific GMM with speaker-adapted syllable-based HMM. Proc. ICASSP-2004 1, 81–84]. The robustness of that method to changes in speaking style in a close-talking environment was evaluated in (Nakagawa et al., 2004). In this paper, we extend the combination method to distant speaker recognition and integrate it with the proposed position-dependent CMN. Our experiments showed that the proposed method improved speaker recognition performance remarkably in a distant environment. © 2007 Elsevier B.V. All rights reserved.
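The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the compensation parameter for each grid point is simply the mean cepstral vector of training frames recorded at that point, and that the estimated speaker position is mapped to the nearest grid point by Euclidean distance. The function names `estimate_position_means`, `nearest_grid_point`, and `position_dependent_cmn` are hypothetical, and position estimation itself is taken as given.

```python
import numpy as np


def estimate_position_means(training_cepstra):
    """Training stage (assumed form): one mean cepstral vector per grid point.

    training_cepstra maps a grid point (tuple of coordinates) to an array of
    shape (num_frames, num_coefficients) recorded from that point.
    """
    return {pos: frames.mean(axis=0) for pos, frames in training_cepstra.items()}


def nearest_grid_point(estimated_position, grid_points):
    """Return the grid point closest to the estimated speaker position."""
    points = np.asarray(grid_points, dtype=float)
    target = np.asarray(estimated_position, dtype=float)
    distances = np.linalg.norm(points - target, axis=1)
    return grid_points[int(np.argmin(distances))]


def position_dependent_cmn(cepstra, estimated_position, position_means):
    """Recognition stage: subtract the a priori mean of the nearest grid point.

    cepstra has shape (num_frames, num_coefficients); the same mean vector is
    subtracted from every frame.
    """
    grid_points = list(position_means.keys())
    key = nearest_grid_point(estimated_position, grid_points)
    return cepstra - position_means[key]
```

Because the mean is looked up rather than computed from the incoming utterance, normalization in this sketch can begin from the first frame; the normalized features would then be passed to the speaker models for scoring.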
Related resources
Robust distant speaker recognition based on position dependent cepstral mean normalization
In a distant environment, channel distortion may drastically degrade speaker recognition performance. In this paper, we propose a robust speaker recognition method based on position-dependent Cepstral Mean Normalization (CMN) to compensate for the channel distortion depending on the speaker position. It is shown in [1] that the position-dependent CMN is robust for speech recognition in a distant en...
Text-independent speaker recognition by speaker-specific GMM and speaker adapted syllable-based HMM
We present a new text-independent speaker recognition method that combines a speaker-specific Gaussian Mixture Model (GMM) with a syllable-based HMM adapted by MLLR or MAP. The robustness of this speaker recognition method to changes in speaking style was evaluated. The speaker identification experiment using NTT database which consists of sentences data uttered at three speed modes (normal, fast and ...
Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. Conventional short-term spectrum based Cepstral Mean Normalization (CMN) is therefore not effective under these conditions. In this paper, we propose a robust speech recognition method by combining a short-term spectrum based CMN with a long-term one. We assume that ...
Robust Speech Recognition in Distant Environment Based on Speaker Position and Speaking Direction Detection
In a practical environment, channel distortion may severely degrade speech recognition performance. In this paper, we propose a robust speech recognition method using real-time Cepstral Mean Normalization (CMN) [1] based on speaker position and speaking direction detection. We first estimate the speaker position in a 3-D space based on the time delay of arrival (TDOA) between distinct microphone...
User-customized Password Speaker Verif Gmm Model
In this paper, we present a new approach towards user-customized password speaker verification combining the advantages of hybrid HMM/ANN systems, which use Artificial Neural Networks (ANN) to estimate emission probabilities of Hidden Markov Models, and Gaussian Mixture Models. In the approach presented here, we indeed exploit the properties of hybrid HMM/ANN systems, usually resulting in high phon...
Journal title:
- Speech Communication
Volume 49
Publication date 2007